Welcome to the second part of our TimescaleDB best practices series! In the first part, we explored how to perform massive backfill operations efficiently, sharing techniques to optimize performance and avoid common pitfalls. If you haven’t had a chance to read the first part yet, you can check it out using this link.
In today’s blog, we will discuss another crucial aspect of time-series data management: massive delete operations.
As your data grows over time, older records often lose their relevance but continue to occupy valuable disk space, which can increase storage costs and degrade performance if not managed well.
Let’s walk through some strategies to clean up or downsample aged data in TimescaleDB, helping you maintain a lean, efficient, and cost-effective database.
Prerequisites for Massive Delete Operations
Here are a few important steps to follow before performing a large-scale delete in production, so that we are prepared in case something goes wrong.
Tune Autovacuum Settings
In PostgreSQL, VACUUM is a maintenance process that removes dead tuples: obsolete row versions left behind by UPDATE or DELETE operations. These dead tuples still occupy space but are no longer visible to any active transaction. Vacuuming reclaims this space, helping to reduce table bloat and maintain database performance.
The autovacuum feature automates this process by periodically running in the background, ensuring that dead tuples are cleaned up without manual intervention. This is especially important after large delete operations, where a significant number of dead rows can accumulate. If not handled promptly, this can lead to table bloat, increased I/O, and slower query performance.
However, its effectiveness depends heavily on how well it is configured. Without proper tuning, autovacuum may run too infrequently or too slowly, allowing dead tuples to pile up and impact performance.
Here is a list of important autovacuum parameters along with their recommended values that can be used to tune autovacuum effectively; a sample configuration applying these settings follows the list.
autovacuum
- Description: Enables or disables the autovacuum daemon.
- Default: on
- Recommendation: on
autovacuum_max_workers
- Description: Specifies the maximum number of autovacuum worker processes.
- Default: 3
- Recommendation: CPU Cores / 8, but minimum of 3
- Example: If PostgreSQL is allocated 56 CPU cores, the value should be 56/8 = 7
autovacuum_naptime
- Description: Time interval between autovacuum runs.
- Default: 1min
- Recommendation: 30s
autovacuum_vacuum_scale_factor
- Description: Fraction of the table size that determines the additional number of updated or deleted tuples needed to trigger a vacuum.
- Default: 0.2
- Recommendation: 0.05
autovacuum_analyze_scale_factor
- Description: Fraction of table size that determines the additional number of updated or inserted tuples needed to trigger an analyze.
- Default: 0.1
- Recommendation: 0.05
autovacuum_vacuum_cost_delay
- Description: Cost delay in milliseconds between vacuum operations performed by autovacuum workers.
- Default: 2ms (20ms in PostgreSQL 11 and earlier)
- Recommendation: 100ms
autovacuum_vacuum_cost_limit
- Description: Cost limit for vacuum operations performed by autovacuum workers.
- Default: -1 (uses vacuum_cost_limit)
- Recommendation: 2000
log_autovacuum_min_duration
- Description: Logs autovacuum actions if they take longer than the specified duration (in milliseconds).
- Default: -1 (disabled) in PostgreSQL 14 and earlier; 10min from PostgreSQL 15
- Recommendation: 1000ms
autovacuum_work_mem
- Description: Memory to be used by autovacuum workers for each table they process.
- Default: -1 (uses maintenance_work_mem)
- Recommendation: 20% of system memory / autovacuum_max_workers
- Example: If PostgreSQL is allocated 300 GB of memory and autovacuum_max_workers is 10, then this value should be 20% of 300 GB / 10 = 6 GB.
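To make these recommendations concrete, below is a sketch of how they could be applied with ALTER SYSTEM. The values assume the hypothetical 56-core, 300 GB server used in the examples above and should be recalculated for your own hardware.

-- Example only: values follow the recommendations above for a hypothetical
-- 56-core, 300 GB server; recalculate them for your own hardware.
ALTER SYSTEM SET autovacuum = on;
ALTER SYSTEM SET autovacuum_max_workers = 7;            -- 56 cores / 8
ALTER SYSTEM SET autovacuum_naptime = '30s';
ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.05;
ALTER SYSTEM SET autovacuum_analyze_scale_factor = 0.05;
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '100ms';
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000;
ALTER SYSTEM SET log_autovacuum_min_duration = '1s';
ALTER SYSTEM SET autovacuum_work_mem = '8GB';           -- ~20% of 300 GB / 7 workers
-- autovacuum_max_workers requires a server restart; the remaining settings
-- take effect after a configuration reload:
SELECT pg_reload_conf();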
Below are a couple of blogs where you can get additional details on the autovacuum process.
- Leveraging autovacuum in PostgreSQL to optimize performance and reduce costs
- VACUUM FULL in PostgreSQL – What you need to be mindful of
There are a few additional prerequisites to consider, which we already discussed in Part 1 of this series. For a complete understanding, please refer to the following sections from that post.
- Back Up Production Before Performing a Massive Delete
- Always Test on Staging Before Pushing to Production
- Select a Time with Low Customer Activity
- Advanced Customer Notification for Database Maintenance
Once all prerequisites are met, we can use the following approaches to perform the massive delete operation.
Methods to Perform Massive Delete Operations
In this section, we will cover a few key methods for performing a massive delete operation in a production environment. However, it is strongly advised to first test any chosen method in the staging environment before applying it to the production system.
Method 1: Use the Data Retention policy from TimescaleDB
To perform a large-scale delete, Timescale recommends using a data retention policy. Data retention helps you save on storage costs by deleting old data automatically.
A data retention policy automatically drops data once its time value ages past a specified interval. When we create a data retention policy, Timescale automatically schedules a background job to drop old chunks.
This is how we can add a data retention policy:
SELECT add_retention_policy('test_table', INTERVAL '10 months');
In the example above, we are creating a data retention policy on the hypertable named test_table, specifying that any data older than 10 months should be automatically deleted.
Use this link to learn more about the add_retention_policy function.
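As a small follow-up sketch (still using the test_table hypertable from the example), you can confirm the background job that the policy creates through the timescaledb_information.jobs view, and drop the policy with remove_retention_policy if it is no longer needed:

-- List the background job created for the retention policy
SELECT job_id, hypertable_name, config
FROM timescaledb_information.jobs
WHERE proc_name = 'policy_retention';

-- Remove the retention policy from test_table
SELECT remove_retention_policy('test_table');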
Method 1.1: Downsample Data Using Retention Policies and Continuous Aggregates
Older and historical data is often retained to support future analytical needs, even if it’s unclear when or how that data might be used. As a result, this data is typically not deleted, which leads to increased storage consumption over time.
To address this, Timescale provides a solution that allows data to be deleted while still remaining available in an aggregated form for historical analysis. This can be achieved by combining a retention policy with a continuous aggregate.
This means we can delete old data from a hypertable without deleting it from any continuous aggregates. This approach lets us save on raw data storage while keeping summarized data for historical analysis.
When using this approach, it’s crucial to carefully configure the refresh interval of the continuous aggregate in relation to the data retention policy.
For example, if you set the data retention policy to delete data older than 1 day, and the continuous aggregate is also configured to refresh every 1 day, there’s a risk that the raw data needed for the refresh will already be deleted by the time the refresh occurs. As a result, the continuous aggregate may not have access to the necessary data, rendering it incomplete or useless.
To avoid this, always ensure that the data retention period is longer than the refresh window of the continuous aggregate. This allows the system enough time to update the aggregate before the raw data is removed.
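Here is a hedged sketch of the combination, assuming a hypothetical conditions hypertable with time, device_id, and temperature columns: raw rows are rolled up into a daily continuous aggregate, and the retention policy only drops raw chunks after 30 days, comfortably longer than the aggregate's refresh window.

-- Daily rollup of the raw data (hypothetical schema)
CREATE MATERIALIZED VIEW conditions_daily
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', time) AS bucket,
       device_id,
       avg(temperature) AS avg_temperature
FROM conditions
GROUP BY bucket, device_id;

-- Refresh the aggregate once a day, covering the last 3 days of raw data
SELECT add_continuous_aggregate_policy('conditions_daily',
    start_offset      => INTERVAL '3 days',
    end_offset        => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 day');

-- Drop raw chunks only after 30 days, well after they have been aggregated
SELECT add_retention_policy('conditions', INTERVAL '30 days');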
Use this link to learn more about combining data retention with continuous aggregates to downsample data, and how to properly configure the refresh interval for your continuous aggregate.
Method 2: Manually Dropping Chunks in TimescaleDB
If you prefer more control over the deletion process and would rather manage it manually instead of using an automated data retention policy, this approach may be more suitable for you. It allows you to select and delete chunks based on your specific needs and requirements.
From a performance perspective, deleting data row by row with the PostgreSQL DELETE command can be slow. Dropping data by the chunk is much faster, because it removes an entire file from disk and requires no garbage collection or defragmentation.
You can use the following command to manually delete a chunk:
SELECT drop_chunks('test_table', INTERVAL '24 hours');
This drops all chunks from the test_table hypertable whose data is entirely older than 24 hours.
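Before dropping anything, it is worth previewing which chunks would be affected. The show_chunks function accepts the same older_than argument, so (as a quick sketch) you can run:

-- Preview the chunks that would be removed
SELECT show_chunks('test_table', older_than => INTERVAL '24 hours');

-- Drop them once you are satisfied with the list
SELECT drop_chunks('test_table', older_than => INTERVAL '24 hours');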
Use this link to learn more about deleting chunks manually in TimescaleDB.
Method 3: Transfer Selected Data to a New Hypertable
This method is especially useful in scenarios where you need to delete a significant portion of data from a hypertable, typically when a large percentage of the data is no longer needed. Rather than deleting rows individually or relying on retention policies, this approach allows for a more controlled and efficient cleanup.
Here are the steps involved:
- Define a new hypertable with the same schema, including all necessary columns, indexes, and constraints.
- Insert/Copy only the relevant or required subset of data from the original hypertable into the new one. This allows you to discard unwanted data during the copy process.
- Perform sanity checks or data validation to ensure that all essential data has been accurately transferred to the new hypertable.
- Once validation is complete and you are confident in the data integrity, delete the original hypertable to reclaim storage.
This method is most effective when you can clearly define what data needs to be retained, and when a large portion of the existing data is no longer necessary. It provides a clean break and helps reduce bloat in cases where traditional deletion methods may be inefficient or slow.
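As a rough sketch of these steps, suppose you only need to keep the last 12 months of test_table and its time column is named time (the table and column names here are assumptions; adapt them, along with the chunk interval and validation queries, to your own schema):

-- 1. Create a new hypertable with the same schema, indexes, and constraints
CREATE TABLE test_table_new (LIKE test_table INCLUDING ALL);
SELECT create_hypertable('test_table_new', 'time', create_default_indexes => false);

-- 2. Copy only the data you want to keep
INSERT INTO test_table_new
SELECT * FROM test_table
WHERE time >= now() - INTERVAL '12 months';

-- 3. Sanity check: row counts for the retained window should match
SELECT count(*) FROM test_table WHERE time >= now() - INTERVAL '12 months';
SELECT count(*) FROM test_table_new;

-- 4. Swap names, then drop the old hypertable to reclaim storage
BEGIN;
ALTER TABLE test_table RENAME TO test_table_old;
ALTER TABLE test_table_new RENAME TO test_table;
COMMIT;
DROP TABLE test_table_old;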
What to Do After Performing a Massive Delete Operation
After performing a large-scale delete operation, it is recommended to run VACUUM FULL on the hypertable to reclaim disk space by removing dead tuples and returning the freed space to the operating system. Keep in mind that VACUUM FULL rewrites the table and takes an ACCESS EXCLUSIVE lock, blocking reads and writes for its duration, so schedule it during a low-activity window.
Here is how you can run it:
VACUUM FULL test_table;
where test_table is the hypertable name.
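To verify the effect, you can compare the hypertable's size before and after the cleanup. In TimescaleDB 2.x, a quick check (again using the test_table example) looks like this:

-- Total on-disk size of the hypertable, including all chunks and indexes
SELECT pg_size_pretty(hypertable_size('test_table'));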
Conclusion
Finally, we recommend following the community-established guidelines for performing large-scale delete operations on hypertables. Additionally, ensure that your autovacuum settings are properly tuned to efficiently handle dead tuples and reclaim storage without issues.